Data augmentation approaches in natural language processing: A survey

Abstract

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learning techniques may fail. It is widely applied in computer vision, was then introduced to natural language processing, and achieves improvements in many tasks. One of the main focuses of DA methods is to improve the diversity of training data, thereby helping the model better generalize to unseen testing data. In this survey, we frame DA methods into three categories based on the diversity of augmented data, including paraphrasing, noising, and sampling. Our paper sets out to analyze DA methods in detail according to the above categories. Further, we also introduce their applications in NLP tasks as well as the challenges. Some helpful resources are provided in the appendix.
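As a rough illustration of the noising category named in the abstract, the sketch below perturbs a tokenized sentence by swapping and deleting random tokens. It is a minimal sketch under stated assumptions, not the survey's own implementation: the function names `random_swap` and `random_delete`, the whitespace tokenization, and the parameter values are all hypothetical choices made for this example.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps randomly chosen position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Avoid returning an empty sentence: fall back to one random token.
    return kept if kept else [rng.choice(tokens)]


# Hypothetical usage: a seeded generator keeps the example reproducible.
rng = random.Random(0)
sentence = "data augmentation improves the diversity of training data".split()
swapped = random_swap(sentence, 2, rng)
deleted = random_delete(sentence, 0.2, rng)
print(" ".join(swapped))
print(" ".join(deleted))
```

Both functions leave the input list untouched and return a new list, so the original sentence can be kept alongside its augmented variants in the training set.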

Similar resources

A Survey on Statistical Approaches to Natural Language Processing

This survey attempts to catch up with the recent growing interest in statistical approaches to natural language processing based on large corpora. First, a historical overview traces back to the 1950s, when Noam Chomsky proposed his phrase structure transformation grammar and rejected the Markov process for natural language modeling. With the development of large corpora and language modeling i...

Natural Language Processing - A Survey

"Computer, would you search the Web for references to our company and determine the 3 most common complaints mentioned about us?" "Sure, Kevin." "Then summarize and format your findings and insert it into the second page of my slide presentation." "No problem." "And also, could you give me a reminder notice a half hour before my plane is supposed to leave?"

Neuro-Fuzzy Approaches to Natural Language Data Processing

Dubbed the ‘age of permissiveness’ by traditional mathematicians, philosophers and linguists, the period since the late 1960s has marked a fundamental change in the estimation of logical exactness of the type ‘A or not-A’. Since then, the realm of Aristotle, the cultural exclusiveness of binary logic, has been seriously challenged by Fuzzy Logic. Fuzzy sets, vagueness, multivalence and multiv...

Bayesian approaches in Natural Language Processing

This paper overviews Bayesian approaches in natural language processing that are becoming prominent. Without any knowledge of natural language processing, Bayesian approaches to both discriminative learning and generative modeling are described. Especially, naive Bayes and its full unsupervised Bayesian modeling, DM, and LDA are developed. These Bayesian approaches permit interesting joint mode...

Similarity-Based Approaches to Natural Language Processing

Statistical methods for automatically extracting information about associations between words or documents from large collections of text have the potential to have considerable impact in a number of areas, such as information retrieval and natural-language-based user interfaces. However, even huge bodies of text yield highly unreliable estimates of the probability of relatively common events, ...


Journal

Journal title: AI Open

Year: 2022

ISSN: 2666-6510

DOI: https://doi.org/10.1016/j.aiopen.2022.03.001